{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "93ae9bad-b8cc-43de-ba7d-387e0155674c",
   "metadata": {},
   "source": [
    "# Building a Natively Multimodal RAG Pipeline (over a Slide Deck)\n",
    "\n",
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_parse/blob/main/examples/multimodal/multimodal_rag_slide_deck.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\n",
    "In this cookbook we show you how to build a multimodal RAG pipeline over a slide deck, with text, tables, images, diagrams, and complex layouts.\n",
    "\n",
    "A gap of text-based RAG is that they struggle with purely text-based representations of complex documents. For instance, if a page contains a lot of images and diagrams, a text parser would need to rely on raw OCR to extract out text. You can also use a multimodal model (e.g. gpt-4o and up) to do text extraction, but this is inherently a lossy conversion.\n",
    "\n",
    "Instead a **native multimodal pipeline** stores both a text and image representation of a document chunk. They are indexed via embeddings (text or image), and during synthesis both text and image are directly fed to the multimodal model for synthesis.\n",
    "\n",
    "This can have the following advantages:\n",
    "- **Robustness**: This solution is more robust than a pure text or even a pure image-based approach. In a pure text RAG approach, the parsing piece can be lossy. In a pure image-based approach, multimodal OCR is not perfect and may lose out against text parsing for text-heavy documents.\n",
    "- **Cost Optimization**: You may choose to dynamically include text-only, or text + image depending on the content of the page.\n",
    "\n",
    "![mm_rag_diagram](./multimodal_rag_slide_deck_img.png)"
   ]
  },
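  {
   "cell_type": "markdown",
   "id": "cost-optimization-sketch",
   "metadata": {},
   "source": [
    "To make the cost-optimization point concrete, here is a minimal sketch of one way to choose modalities per page. The `select_modalities` helper and its 400-character threshold are hypothetical illustrations, not part of LlamaIndex or LlamaParse, and a real pipeline would likely need a better signal than text length.\n",
    "\n",
    "```python\n",
    "def select_modalities(parsed_text, min_chars=400):\n",
    "    # Hypothetical heuristic: pages with little extracted text are likely\n",
    "    # dominated by charts or diagrams, so the screenshot is worth the tokens.\n",
    "    # The 400-character default is an arbitrary illustrative value.\n",
    "    dense_text = len(parsed_text.strip()) >= min_chars\n",
    "    return ['text'] if dense_text else ['text', 'image']\n",
    "\n",
    "\n",
    "print(select_modalities('Revenue grew. ' * 50))  # long, text-heavy page\n",
    "print(select_modalities('Q3 chart'))  # short, likely visual page\n",
    "```\n",
    "\n",
    "The rest of this cookbook always sends both text and image; a check like this would sit just before the synthesis call."
   ]
  },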
  {
   "cell_type": "markdown",
   "id": "54e8d9a7-5036-4d32-818f-00b2e888521f",
   "metadata": {},
   "source": [
    "## Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "70ccdd53-e68a-4199-aacb-cfe71ad1ff0b",
   "metadata": {},
   "outputs": [],
   "source": [
    "import nest_asyncio\n",
    "\n",
    "nest_asyncio.apply()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "225c5556-a789-4386-a1ee-cce01dbeb6cf",
   "metadata": {},
   "source": [
    "### Setup Observability\n",
    "\n",
    "We setup an integration with LlamaTrace (integration with Arize).\n",
    "\n",
    "If you haven't already done so, make sure to create an account here: https://llamatrace.com/login. Then create an API key and put it in the `PHOENIX_API_KEY` variable below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0eabee1f-290a-4c85-b362-54f45c8559ae",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install -U llama-index-callbacks-arize-phoenix"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aaeb245c-730b-4c34-ad68-708fdde0e6cb",
   "metadata": {},
   "outputs": [],
   "source": [
    "# setup Arize Phoenix for logging/observability\n",
    "import llama_index.core\n",
    "import os\n",
    "\n",
    "PHOENIX_API_KEY = \"<PHOENIX_API_KEY>\"\n",
    "os.environ[\"OTEL_EXPORTER_OTLP_HEADERS\"] = f\"api_key={PHOENIX_API_KEY}\"\n",
    "llama_index.core.set_global_handler(\n",
    "    \"arize_phoenix\", endpoint=\"https://llamatrace.com/v1/traces\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fbb362db-b1b1-4eea-be1a-b1f78b0779d7",
   "metadata": {},
   "source": [
    "### Load Data\n",
    "\n",
    "Here we load the [Conoco Phillips 2023 investor meeting slide deck](https://static.conocophillips.com/files/2023-conocophillips-aim-presentation.pdf)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8bce3407-a7d2-47e8-9eaf-ab297a94750c",
   "metadata": {},
   "outputs": [],
   "source": [
    "!mkdir data\n",
    "!mkdir data_images\n",
    "!wget \"https://static.conocophillips.com/files/2023-conocophillips-aim-presentation.pdf\" -O data/conocophillips.pdf"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "246ba6b0-51af-42f9-b1b2-8d3e721ef782",
   "metadata": {},
   "source": [
    "### Model Setup\n",
    "\n",
    "Setup models that will be used for downstream orchestration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "16e2071d-bbc2-4707-8ae7-cb4e1fecafd3",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import Settings\n",
    "from llama_index.llms.openai import OpenAI\n",
    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
    "\n",
    "embed_model = OpenAIEmbedding(model=\"text-embedding-3-large\")\n",
    "llm = OpenAI(model=\"gpt-4o\")\n",
    "\n",
    "Settings.embed_model = embed_model\n",
    "Settings.llm = llm"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e3f6416f-f580-4722-aaa9-7f3500408547",
   "metadata": {},
   "source": [
    "## Use LlamaParse to Parse Text and Images\n",
    "\n",
    "In this example, use LlamaParse to parse both the text and images from the document.\n",
    "\n",
    "We parse out the text in two ways: \n",
    "- in regular `text` mode using our default text layout algorithm\n",
    "- in `markdown` mode using GPT-4o (`gpt4o_mode=True`). This also allows us to capture page screenshots"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "570089e5-238a-4dcc-af65-96e7393c2b4d",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_parse import LlamaParse\n",
    "\n",
    "\n",
    "parser_text = LlamaParse(result_type=\"text\")\n",
    "parser_gpt4o = LlamaParse(result_type=\"markdown\", gpt4o_mode=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ef82a985-4088-4bb7-9a21-0318e1b9207d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Parsing text...\n",
      "Started parsing the file under job_id 62f157a9-9ef9-4e5b-95ac-67093fa25800\n",
      "..........Parsing PDF file...\n",
      "Started parsing the file under job_id 1ddd5654-062b-4e19-b488-d66efc9c509d\n"
     ]
    }
   ],
   "source": [
    "print(f\"Parsing text...\")\n",
    "docs_text = parser_text.load_data(\"data/conocophillips.pdf\")\n",
    "print(f\"Parsing PDF file...\")\n",
    "md_json_objs = parser_gpt4o.get_json_result(\"data/conocophillips.pdf\")\n",
    "md_json_list = md_json_objs[0][\"pages\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5318fb7b-fe6a-4a8a-b82e-4ed7b4512c37",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "# Commitment to Disciplined Reinvestment Rate\n",
      "\n",
      "| Period       | Description                          | Reinvestment Rate | WTI Average |\n",
      "|--------------|--------------------------------------|-------------------|-------------|\n",
      "| 2012-2016    | Industry Growth Focus                | >100%             | ~$75/BBL    |\n",
      "| 2017-2022    | ConocoPhillips Strategy Reset        | <60%              | ~$63/BBL    |\n",
      "| 2023E        |                                      |                   | at $80/BBL  |\n",
      "| 2024-2028    | Disciplined Reinvestment Rate        | ~50%              | at $60/BBL  |\n",
      "| 2029-2032    |                                      | ~6% CFO CAGR      | at $60/BBL  |\n",
      "\n",
      "- **Historic Reinvestment Rate**: Gray bars\n",
      "- **Reinvestment Rate at $60/BBL WTI**: Blue bars\n",
      "- **Reinvestment Rate at $80/BBL WTI**: Dashed blue lines\n",
      "\n",
      "Reinvestment rate and cash from operations (CFO) are non-GAAP measures. Definitions and reconciliations are included in the Appendix.\n"
     ]
    }
   ],
   "source": [
    "print(md_json_list[10][\"md\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "eeadb16c-97eb-4622-9551-b34d7f90d72f",
   "metadata": {},
   "outputs": [],
   "source": [
    "image_dicts = parser_gpt4o.get_images(md_json_objs, download_path=\"data_images\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fd3e098b-0606-4429-b48d-d4fe0140fc0e",
   "metadata": {},
   "source": [
    "## Build Multimodal Index\n",
    "\n",
    "In this section we build the multimodal index over the parsed deck. \n",
    "\n",
    "We do this by creating **text** nodes from the document that contain metadata referencing the original image path.\n",
    "\n",
    "In this example we're indexing the text node for retrieval. The text node has a reference to both the parsed text as well as the image screenshot."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3aae2dee-9d85-4604-8a51-705d4db527f7",
   "metadata": {},
   "source": [
    "#### Get Text Nodes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "18c24174-05ce-417f-8dd2-79c3f375db03",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.schema import TextNode\n",
    "from typing import Optional"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8e331dfe-a627-4e23-8c57-70ab1d9342e4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# get pages loaded through llamaparse\n",
    "import re\n",
    "\n",
    "\n",
    "def get_page_number(file_name):\n",
    "    match = re.search(r\"-page-(\\d+)\\.jpg$\", str(file_name))\n",
    "    if match:\n",
    "        return int(match.group(1))\n",
    "    return 0\n",
    "\n",
    "\n",
    "def _get_sorted_image_files(image_dir):\n",
    "    \"\"\"Get image files sorted by page.\"\"\"\n",
    "    raw_files = [f for f in list(Path(image_dir).iterdir()) if f.is_file()]\n",
    "    sorted_files = sorted(raw_files, key=get_page_number)\n",
    "    return sorted_files"
   ]
  },
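  {
   "cell_type": "markdown",
   "id": "page-sort-sketch",
   "metadata": {},
   "source": [
    "Sorting on the parsed page number matters because plain string sorting puts page 10 before page 2. The sketch below uses made-up file names and a hypothetical `page_of` helper that accepts both `-page_N.jpg` and `-page-N.jpg`; it only illustrates the ordering issue and is not part of LlamaParse.\n",
    "\n",
    "```python\n",
    "import re\n",
    "\n",
    "# Made-up names that mimic the '<job_id>-page_<N>.jpg' pattern.\n",
    "names = ['job-page_10.jpg', 'job-page_2.jpg', 'job-page_1.jpg']\n",
    "\n",
    "\n",
    "def page_of(name):\n",
    "    # Return the trailing page number, or 0 when the name does not match.\n",
    "    match = re.search(r'-page[-_](\\d+)\\.jpg$', name)\n",
    "    return int(match.group(1)) if match else 0\n",
    "\n",
    "\n",
    "print(sorted(names))  # lexicographic order: page_10 lands before page_2\n",
    "print(sorted(names, key=page_of))  # numeric order: 1, 2, 10\n",
    "```\n",
    "\n",
    "Note that a pattern which never matches makes every key 0, so the sort silently keeps whatever order the directory listing returned."
   ]
  },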
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "346fe5ef-171e-4a54-9084-7a7805103a13",
   "metadata": {},
   "outputs": [],
   "source": [
    "from copy import deepcopy\n",
    "from pathlib import Path\n",
    "\n",
    "\n",
    "# attach image metadata to the text nodes\n",
    "def get_text_nodes(docs, image_dir=None, json_dicts=None):\n",
    "    \"\"\"Split docs into nodes, by separator.\"\"\"\n",
    "    nodes = []\n",
    "\n",
    "    image_files = _get_sorted_image_files(image_dir) if image_dir is not None else None\n",
    "    md_texts = [d[\"md\"] for d in json_dicts] if json_dicts is not None else None\n",
    "\n",
    "    doc_chunks = [c for d in docs for c in d.text.split(\"---\")]\n",
    "    for idx, doc_chunk in enumerate(doc_chunks):\n",
    "        chunk_metadata = {\"page_num\": idx + 1}\n",
    "        if image_files is not None:\n",
    "            image_file = image_files[idx]\n",
    "            chunk_metadata[\"image_path\"] = str(image_file)\n",
    "        if md_texts is not None:\n",
    "            chunk_metadata[\"parsed_text_markdown\"] = md_texts[idx]\n",
    "        chunk_metadata[\"parsed_text\"] = doc_chunk\n",
    "        node = TextNode(\n",
    "            text=\"\",\n",
    "            metadata=chunk_metadata,\n",
    "        )\n",
    "        nodes.append(node)\n",
    "\n",
    "    return nodes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f591669c-5a8e-491d-9cef-0b754abbf26f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# this will split into pages\n",
    "text_nodes = get_text_nodes(docs_text, image_dir=\"data_images\", json_dicts=md_json_list)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "32c13950-c1db-435f-b5b4-89d62b8b7744",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "page_num: 11\n",
      "image_path: data_images/1ddd5654-062b-4e19-b488-d66efc9c509d-page_39.jpg\n",
      "parsed_text_markdown: # Commitment to Disciplined Reinvestment Rate\n",
      "\n",
      "| Period       | Description                          | Reinvestment Rate | WTI Average |\n",
      "|--------------|--------------------------------------|-------------------|-------------|\n",
      "| 2012-2016    | Industry Growth Focus                | >100%             | ~$75/BBL    |\n",
      "| 2017-2022    | ConocoPhillips Strategy Reset        | <60%              | ~$63/BBL    |\n",
      "| 2023E        |                                      |                   | at $80/BBL  |\n",
      "| 2024-2028    | Disciplined Reinvestment Rate        | ~50%              | at $60/BBL  |\n",
      "| 2029-2032    |                                      | ~6% CFO CAGR      | at $60/BBL  |\n",
      "\n",
      "- **Historic Reinvestment Rate**: Gray bars\n",
      "- **Reinvestment Rate at $60/BBL WTI**: Blue bars\n",
      "- **Reinvestment Rate at $80/BBL WTI**: Dashed blue lines\n",
      "\n",
      "Reinvestment rate and cash from operations (CFO) are non-GAAP measures. Definitions and reconciliations are included in the Appendix.\n",
      "parsed_text: Commitment to Disciplined Reinvestment Rate\n",
      "                         Industry                    ConocoPhillips\n",
      "                                                     Strategy Reset                   Disciplined Reinvestment Rate is the Foundation for Superior\n",
      "                      Growth Focus                                                    Returns on and of Capital, while Driving Durable CFO Growth\n",
      "                             100%                           <60%                                        50%                 6%         at $60/BBL WTI\n",
      "                       Reinvestment Rate               Reinvestment Rate                          Reinvestment Rate10-YearCFO CAGR          Planning PriceMid-Cycle\n",
      "                                                                                                                         2024-2032\n",
      "    2   100%\n",
      "    1    75%\n",
      "    1    50%\n",
      "    1                                                                                                                                         WTIat $80/BBL                at S80/BBL\n",
      "         25%                'S75/BBL                        $63/BBL                                                                                                        WTI\n",
      "                              WTI                             WTI                         at S80/BBL                       at S60/BBL                      at S60/BBL\n",
      "                           Average                         Average                            WTI                              WTI                             WTI\n",
      "          0%\n",
      "                          2012-2016                        2017-2022                         2023E                          2024-2028                       2029-2032\n",
      "                                    Historic Reinvestment Rate            Reinvestment Rate at $60/BBL WTI                 Reinvestment Rate at $80/BBL WTI\n",
      " Reinvestment rate and cash from operations (CFO) are non-GAAP measures: Definitions and reconciliations are included in the Appendix                                  ConocoPhillips\n"
     ]
    }
   ],
   "source": [
    "print(text_nodes[10].get_content(metadata_mode=\"all\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f404f56-db1e-4ed7-9ba1-ead763546348",
   "metadata": {},
   "source": [
    "#### Build Index\n",
    "\n",
    "Once the text nodes are ready, we feed into our vector store index abstraction, which will index these nodes into a simple in-memory vector store (of course, you should definitely check out our 40+ vector store integrations!)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6ea53c31-0e38-421c-8d9b-0e3adaa1677e",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/jerryliu/Programming/gpt_index/.venv/lib/python3.10/site-packages/tiktoken/core.py:50: RuntimeWarning: coroutine 'LlamaParse.aload_data' was never awaited\n",
      "  self._core_bpe = _tiktoken.CoreBPE(mergeable_ranks, special_tokens, pat_str)\n",
      "RuntimeWarning: Enable tracemalloc to get the object allocation traceback\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "from llama_index.core import (\n",
    "    StorageContext,\n",
    "    VectorStoreIndex,\n",
    "    load_index_from_storage,\n",
    ")\n",
    "\n",
    "if not os.path.exists(\"storage_nodes\"):\n",
    "    index = VectorStoreIndex(text_nodes, embed_model=embed_model)\n",
    "    # save index to disk\n",
    "    index.set_index_id(\"vector_index\")\n",
    "    index.storage_context.persist(\"./storage_nodes\")\n",
    "else:\n",
    "    # rebuild storage context\n",
    "    storage_context = StorageContext.from_defaults(persist_dir=\"storage_nodes\")\n",
    "    # load index\n",
    "    index = load_index_from_storage(storage_context, index_id=\"vector_index\")\n",
    "\n",
    "retriever = index.as_retriever()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5f0e33a4-9422-498d-87ee-d917bdf74d80",
   "metadata": {},
   "source": [
    "## Build Multimodal Query Engine\n",
    "\n",
    "We now use LlamaIndex abstractions to build a **custom query engine**. In contrast to a standard RAG query engine that will retrieve the text node and only put that into the prompt (response synthesis module), this custom query engine will also load the image document, and put both the text and image document into the response synthesis module."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "35a94be2-e289-41a6-92e4-d3cb428fb0c8",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.query_engine import CustomQueryEngine, SimpleMultiModalQueryEngine\n",
    "from llama_index.core.retrievers import BaseRetriever\n",
    "from llama_index.multi_modal_llms.openai import OpenAIMultiModal\n",
    "from llama_index.core.schema import ImageNode, NodeWithScore, MetadataMode\n",
    "from llama_index.core.prompts import PromptTemplate\n",
    "from llama_index.core.base.response.schema import Response\n",
    "from typing import Optional\n",
    "\n",
    "\n",
    "gpt_4o = OpenAIMultiModal(model=\"gpt-4o\", max_new_tokens=4096)\n",
    "\n",
    "QA_PROMPT_TMPL = \"\"\"\\\n",
    "Below we give parsed text from slides in two different formats, as well as the image.\n",
    "\n",
    "We parse the text in both 'markdown' mode as well as 'raw text' mode. Markdown mode attempts \\\n",
    "to convert relevant diagrams into tables, whereas raw text tries to maintain the rough spatial \\\n",
    "layout of the text.\n",
    "\n",
    "Use the image information first and foremost. ONLY use the text/markdown information \n",
    "if you can't understand the image.\n",
    "\n",
    "---------------------\n",
    "{context_str}\n",
    "---------------------\n",
    "Given the context information and not prior knowledge, answer the query. Explain whether you got the answer\n",
    "from the parsed markdown or raw text or image, and if there's discrepancies, and your reasoning for the final answer.\n",
    "\n",
    "Query: {query_str}\n",
    "Answer: \"\"\"\n",
    "\n",
    "QA_PROMPT = PromptTemplate(QA_PROMPT_TMPL)\n",
    "\n",
    "\n",
    "class MultimodalQueryEngine(CustomQueryEngine):\n",
    "    \"\"\"Custom multimodal Query Engine.\n",
    "\n",
    "    Takes in a retriever to retrieve a set of document nodes.\n",
    "    Also takes in a prompt template and multimodal model.\n",
    "\n",
    "    \"\"\"\n",
    "\n",
    "    qa_prompt: PromptTemplate\n",
    "    retriever: BaseRetriever\n",
    "    multi_modal_llm: OpenAIMultiModal\n",
    "\n",
    "    def __init__(self, qa_prompt: Optional[PromptTemplate] = None, **kwargs) -> None:\n",
    "        \"\"\"Initialize.\"\"\"\n",
    "        super().__init__(qa_prompt=qa_prompt or QA_PROMPT, **kwargs)\n",
    "\n",
    "    def custom_query(self, query_str: str):\n",
    "        # retrieve text nodes\n",
    "        nodes = self.retriever.retrieve(query_str)\n",
    "        # create ImageNode items from text nodes\n",
    "        image_nodes = [\n",
    "            NodeWithScore(node=ImageNode(image_path=n.metadata[\"image_path\"]))\n",
    "            for n in nodes\n",
    "        ]\n",
    "\n",
    "        # create context string from text nodes, dump into the prompt\n",
    "        context_str = \"\\n\\n\".join(\n",
    "            [r.get_content(metadata_mode=MetadataMode.LLM) for r in nodes]\n",
    "        )\n",
    "        fmt_prompt = self.qa_prompt.format(context_str=context_str, query_str=query_str)\n",
    "\n",
    "        # synthesize an answer from formatted text and images\n",
    "        llm_response = self.multi_modal_llm.complete(\n",
    "            prompt=fmt_prompt,\n",
    "            image_documents=[image_node.node for image_node in image_nodes],\n",
    "        )\n",
    "        return Response(\n",
    "            response=str(llm_response),\n",
    "            source_nodes=nodes,\n",
    "            metadata={\"text_nodes\": text_nodes, \"image_nodes\": image_nodes},\n",
    "        )\n",
    "\n",
    "        return response"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0890be59-fb12-4bb5-959b-b2d9600f7774",
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = MultimodalQueryEngine(\n",
    "    retriever=index.as_retriever(similarity_top_k=9), multi_modal_llm=gpt_4o\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a92aa4f1-7501-4711-b054-f02338e54e74",
   "metadata": {},
   "source": [
    "### Define Baseline\n",
    "\n",
    "In addition, we define a \"baseline\" where we rely only on text-based indexing. Here we define an index using only the nodes that are parsed in text-mode from LlamaParse. \n",
    "\n",
    "**NOTE**: We don't currently include the markdown-parsed text because that was parsed with GPT-4o, so already uses a multimodal model during the text extraction phase.\n",
    "\n",
    "It is of course a valid experiment to compare RAG where multimodal extraction only happens during indexing, vs. the current multimodal RAG implementation where images are fed during synthesis to the LLM. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c0b15a48-d177-4666-aec2-98ee90664642",
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_nodes(docs):\n",
    "    \"\"\"Split docs into nodes, by separator.\"\"\"\n",
    "    nodes = []\n",
    "    for doc in docs:\n",
    "        doc_chunks = doc.text.split(\"\\n---\\n\")\n",
    "        for doc_chunk in doc_chunks:\n",
    "            node = TextNode(\n",
    "                text=doc_chunk,\n",
    "                metadata=deepcopy(doc.metadata),\n",
    "            )\n",
    "            nodes.append(node)\n",
    "\n",
    "    return nodes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2065d2c6-d6ba-4ee3-8e9e-dbc83cbcec1b",
   "metadata": {},
   "outputs": [],
   "source": [
    "base_nodes = get_nodes(docs_text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bcaea1a8-26c9-4385-8f62-32855aa898b6",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Our Differentiated Portfolio: Deep; Durable and Diverse\n",
      "                              20 BBOE of Resource                                           Diverse Production Base\n",
      "                            Under $40/BBL Cost of Supply                              10-Year Plan Cumulative Production (BBOE)\n",
      "      S50                   S32/BBL                                                Lower 48                           Alaska\n",
      "                    Average Cost of Supply\n",
      "  3 $40                                                                                                                       GKA        GWA\n",
      "                                                                                                                      GPA     WNS\n",
      "      $30                                                                                                             EMENA\n",
      "  3                                                                                                                              Norway\n",
      "  8   $20\n",
      "  E                                                                                                                   Qatar      Libya\n",
      "                                                                                                                      Asia Pacific Canada\n",
      "      $10                                                                          Permian\n",
      "                                                                                                                      APLNG        Montney\n",
      "       S0\n",
      "                                               10              15              20                         Bakken\n",
      "                                         Resource (BBOE)                           Eagle Ford             Other       Malaysia ChinaSurmont\n",
      "                 Lower 48      Canada       Alaska      EMENA    Asia Pacific\n",
      "Costs assumemid-cycle price environment of S60/BBL WTI:\n",
      "                                                                                                                                           ConocoPhillips\n"
     ]
    }
   ],
   "source": [
    "print(base_nodes[13].get_content(metadata_mode=\"all\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f6bcfbc6-4e9b-41ad-ad81-1c4245b95cd5",
   "metadata": {},
   "outputs": [],
   "source": [
    "base_index = VectorStoreIndex(base_nodes, embed_model=embed_model)\n",
    "base_query_engine = base_index.as_query_engine(llm=llm, similarity_top_k=9)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1f94ef26-0df5-4468-a156-903d686f02ce",
   "metadata": {},
   "source": [
    "## Build a Multimodal Agent\n",
    "\n",
    "Build an agent around the multimodal query engine. This gives you agent capabilities like query planning/decomposition and memory around a central QA interface."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5b7a8c5f-39fc-4d04-8c56-3642f5718437",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.tools import QueryEngineTool\n",
    "from llama_index.core.agent import FunctionCallingAgentWorker\n",
    "\n",
    "\n",
    "vector_tool = QueryEngineTool.from_defaults(\n",
    "    query_engine=query_engine,\n",
    "    name=\"vector_tool\",\n",
    "    description=(\n",
    "        \"Useful for retrieving specific context from the data. Do NOT select if question asks for a summary of the data.\"\n",
    "    ),\n",
    ")\n",
    "agent = FunctionCallingAgentWorker.from_tools(\n",
    "    [vector_tool], llm=llm, verbose=True\n",
    ").as_agent()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2b4f7eb1-d247-45fa-bb41-c02fc353a22a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# define a similar agent for the baseline\n",
    "base_vector_tool = QueryEngineTool.from_defaults(\n",
    "    query_engine=base_query_engine,\n",
    "    name=\"vector_tool\",\n",
    "    description=(\n",
    "        \"Useful for retrieving specific context from the data. Do NOT select if question asks for a summary of the data.\"\n",
    "    ),\n",
    ")\n",
    "base_agent = FunctionCallingAgentWorker.from_tools(\n",
    "    [base_vector_tool], llm=llm, verbose=True\n",
    ").as_agent()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2336f98b-c0a1-413a-849d-8a89bacb90b5",
   "metadata": {},
   "source": [
    "## Try out Queries\n",
    "\n",
    "Let's try out queries against these documents and compare against each other."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d78e53cf-35cb-4ef8-b03e-1b47ba15ae64",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Added user message to memory: Tell me about the diverse geographies where Conoco Phillips has a production base\n",
      "=== Calling Function ===\n",
      "Calling function: vector_tool with args: {\"input\": \"Conoco Phillips production base geographies\"}\n",
      "=== Function Output ===\n",
      "ConocoPhillips' production base geographies include:\n",
      "\n",
      "1. **Lower 48** (Permian, Eagle Ford, Bakken, Other)\n",
      "2. **Alaska** (GKA, GWA, GPA, WNS)\n",
      "3. **EMENA** (Norway, Libya, Qatar)\n",
      "4. **Asia Pacific** (APLNG, Malaysia, China)\n",
      "5. **Canada** (Montney, Surmont)\n",
      "\n",
      "This information was derived from the image on page 14, which provides a detailed breakdown of the diverse production base and the regions involved. The parsed markdown and raw text also support this information, but the image provides the clearest and most comprehensive view. There are no discrepancies between the image and the parsed text in this case.\n",
      "=== LLM Response ===\n",
      "ConocoPhillips has a diverse production base spread across various geographies, including:\n",
      "\n",
      "1. **Lower 48**:\n",
      "   - Permian Basin\n",
      "   - Eagle Ford\n",
      "   - Bakken\n",
      "   - Other regions within the continental United States\n",
      "\n",
      "2. **Alaska**:\n",
      "   - Greater Kuparuk Area (GKA)\n",
      "   - Greater Prudhoe Area (GPA)\n",
      "   - Greater Willow Area (GWA)\n",
      "   - Western North Slope (WNS)\n",
      "\n",
      "3. **EMENA (Europe, Middle East, and North Africa)**:\n",
      "   - Norway\n",
      "   - Libya\n",
      "   - Qatar\n",
      "\n",
      "4. **Asia Pacific**:\n",
      "   - Australia Pacific LNG (APLNG)\n",
      "   - Malaysia\n",
      "   - China\n",
      "\n",
      "5. **Canada**:\n",
      "   - Montney\n",
      "   - Surmont\n",
      "\n",
      "These regions highlight the global reach and diverse geographical footprint of ConocoPhillips' production operations.\n",
      "Added user message to memory: Tell me about the diverse geographies where Conoco Phillips has a production base\n",
      "=== Calling Function ===\n",
      "Calling function: vector_tool with args: {\"input\": \"diverse geographies where Conoco Phillips has a production base\"}\n",
      "=== Function Output ===\n",
      "ConocoPhillips has a diverse production base that includes the Lower 48 (Permian, Bakken, Eagle Ford), Alaska, Canada (Montney, Surmont), EMENA (Norway, Libya), Asia Pacific (Malaysia, China, APLNG), and Qatar.\n",
      "=== LLM Response ===\n",
      "ConocoPhillips has a diverse production base spanning several key geographies:\n",
      "\n",
      "1. **Lower 48 (United States)**: This includes major production areas such as the Permian Basin, Bakken Formation, and Eagle Ford Shale.\n",
      "2. **Alaska**: Significant operations in the North Slope region.\n",
      "3. **Canada**: Operations in the Montney Formation and the Surmont oil sands project.\n",
      "4. **EMENA (Europe, Middle East, and North Africa)**: Notable operations in Norway and Libya.\n",
      "5. **Asia Pacific**: Includes operations in Malaysia, China, and the Australia Pacific LNG (APLNG) project.\n",
      "6. **Qatar**: Involvement in the country's energy sector.\n",
      "\n",
      "These regions highlight the company's extensive and varied geographical footprint in the energy production industry.\n"
     ]
    }
   ],
   "source": [
    "# Run the same query through the multimodal agent and the baseline agent for comparison\n",
    "query = (\n",
    "    \"Tell me about the diverse geographies where Conoco Phillips has a production base\"\n",
    ")\n",
    "response = agent.query(query)\n",
    "base_response = base_agent.query(query)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "355d2aa4-c26f-480e-b512-4446acbd9227",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ConocoPhillips has a diverse production base spread across various geographies, including:\n",
      "\n",
      "1. **Lower 48**:\n",
      "   - Permian Basin\n",
      "   - Eagle Ford\n",
      "   - Bakken\n",
      "   - Other regions within the continental United States\n",
      "\n",
      "2. **Alaska**:\n",
      "   - Greater Kuparuk Area (GKA)\n",
      "   - Greater Prudhoe Area (GPA)\n",
      "   - Greater Willow Area (GWA)\n",
      "   - Western North Slope (WNS)\n",
      "\n",
      "3. **EMENA (Europe, Middle East, and North Africa)**:\n",
      "   - Norway\n",
      "   - Libya\n",
      "   - Qatar\n",
      "\n",
      "4. **Asia Pacific**:\n",
      "   - Australia Pacific LNG (APLNG)\n",
      "   - Malaysia\n",
      "   - China\n",
      "\n",
      "5. **Canada**:\n",
      "   - Montney\n",
      "   - Surmont\n",
      "\n",
      "These regions highlight the global reach and diverse geographical footprint of ConocoPhillips' production operations.\n"
     ]
    }
   ],
   "source": [
    "# Final answer from the multimodal agent\n",
    "print(str(response))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d584c560-8f49-4c10-a4db-2e0d3b7085d2",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "page_num: 14\n",
      "image_path: data_images/1ddd5654-062b-4e19-b488-d66efc9c509d-page_12.jpg\n",
      "parsed_text_markdown: # Our Differentiated Portfolio: Deep, Durable and Diverse\n",
      "\n",
      "## ~20 BBOE of Resource\n",
      "Under $40/BBL Cost of Supply\n",
      "\n",
      "### ~ $32/BBL\n",
      "Average Cost of Supply\n",
      "\n",
      "### WTI Cost of Supply ($/BBL)\n",
      "\n",
      "| Cost ($/BBL) | Resource (BBOE) |\n",
      "|--------------|-----------------|\n",
      "| $0           | 0               |\n",
      "| $10          |                 |\n",
      "| $20          |                 |\n",
      "| $30          |                 |\n",
      "| $40          |                 |\n",
      "| $50          |                 |\n",
      "\n",
      "- **Legend:**\n",
      "  - Lower 48\n",
      "  - Canada\n",
      "  - Alaska\n",
      "  - EMENA\n",
      "  - Asia Pacific\n",
      "\n",
      "*Costs assume a mid-cycle price environment of $60/BBL WTI.*\n",
      "\n",
      "## Diverse Production Base\n",
      "10-Year Plan Cumulative Production (BBOE)\n",
      "\n",
      "| Region       | Sub-region      |\n",
      "|--------------|-----------------|\n",
      "| Lower 48     | Permian         |\n",
      "|              | Eagle Ford      |\n",
      "|              | Bakken          |\n",
      "|              | Other           |\n",
      "| Alaska       | GKA             |\n",
      "|              | GWA             |\n",
      "|              | GPA             |\n",
      "|              | WNS             |\n",
      "| EMENA        | Norway          |\n",
      "|              | Libya           |\n",
      "|              | Qatar           |\n",
      "| Asia Pacific | APLNG           |\n",
      "|              | Malaysia        |\n",
      "|              | China           |\n",
      "| Canada       | Montney         |\n",
      "|              | Surmont         |\n",
      "parsed_text: Our Differentiated Portfolio: Deep; Durable and Diverse\n",
      "                              20 BBOE of Resource                                           Diverse Production Base\n",
      "                            Under $40/BBL Cost of Supply                              10-Year Plan Cumulative Production (BBOE)\n",
      "      S50                   S32/BBL                                                Lower 48                           Alaska\n",
      "                    Average Cost of Supply\n",
      "  3 $40                                                                                                                       GKA        GWA\n",
      "                                                                                                                      GPA     WNS\n",
      "      $30                                                                                                             EMENA\n",
      "  3                                                                                                                              Norway\n",
      "  8   $20\n",
      "  E                                                                                                                   Qatar      Libya\n",
      "                                                                                                                      Asia Pacific Canada\n",
      "      $10                                                                          Permian\n",
      "                                                                                                                      APLNG        Montney\n",
      "       S0\n",
      "                                               10              15              20                         Bakken\n",
      "                                         Resource (BBOE)                           Eagle Ford             Other       Malaysia ChinaSurmont\n",
      "                 Lower 48      Canada       Alaska      EMENA    Asia Pacific\n",
      "Costs assumemid-cycle price environment of S60/BBL WTI:\n",
      "                                                                                                                                           ConocoPhillips\n"
     ]
    }
   ],
   "source": [
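    "# Inspect one retrieved source node together with its metadata (page number, image path, markdown text).\n",
    "# Index 7 is specific to this run; the retrieval order may differ on a rerun.\n",
    "print(response.source_nodes[7].get_content(metadata_mode=\"all\"))"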
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d21d694b-6618-4d04-a6f6-8b0c2625f539",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ConocoPhillips has a diverse production base spanning several key geographies:\n",
      "\n",
      "1. **Lower 48 (United States)**: This includes major production areas such as the Permian Basin, Bakken Formation, and Eagle Ford Shale.\n",
      "2. **Alaska**: Significant operations in the North Slope region.\n",
      "3. **Canada**: Operations in the Montney Formation and the Surmont oil sands project.\n",
      "4. **EMENA (Europe, Middle East, and North Africa)**: Notable operations in Norway and Libya.\n",
      "5. **Asia Pacific**: Includes operations in Malaysia, China, and the Australia Pacific LNG (APLNG) project.\n",
      "6. **Qatar**: Involvement in the country's energy sector.\n",
      "\n",
      "These regions highlight the company's extensive and varied geographical footprint in the energy production industry.\n"
     ]
    }
   ],
   "source": [
    "# Final answer from the baseline agent\n",
    "print(str(base_response))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d3afccae-ad8d-4c5d-9d93-810dba413a5d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Our Differentiated Portfolio: Deep; Durable and Diverse\n",
      "                              20 BBOE of Resource                                           Diverse Production Base\n",
      "                            Under $40/BBL Cost of Supply                              10-Year Plan Cumulative Production (BBOE)\n",
      "      S50                   S32/BBL                                                Lower 48                           Alaska\n",
      "                    Average Cost of Supply\n",
      "  3 $40                                                                                                                       GKA        GWA\n",
      "                                                                                                                      GPA     WNS\n",
      "      $30                                                                                                             EMENA\n",
      "  3                                                                                                                              Norway\n",
      "  8   $20\n",
      "  E                                                                                                                   Qatar      Libya\n",
      "                                                                                                                      Asia Pacific Canada\n",
      "      $10                                                                          Permian\n",
      "                                                                                                                      APLNG        Montney\n",
      "       S0\n",
      "                                               10              15              20                         Bakken\n",
      "                                         Resource (BBOE)                           Eagle Ford             Other       Malaysia ChinaSurmont\n",
      "                 Lower 48      Canada       Alaska      EMENA    Asia Pacific\n",
      "Costs assumemid-cycle price environment of S60/BBL WTI:\n",
      "                                                                                                                                           ConocoPhillips\n"
     ]
    }
   ],
   "source": [
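    "# Inspect the source node for the same slide from the baseline agent; index 1 is specific to this run.\n",
    "print(base_response.source_nodes[1].get_content(metadata_mode=\"all\"))"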
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llama_index_v3",
   "language": "python",
   "name": "llama_index_v3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
